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We present the implementation of an electronic-structure approach dedicated to ionization dy¬ 
namics of molecules interacting with x-ray free-electron laser (XFEL) pulses. In our scheme, 
molecular orbitals for molecular core-hole states are represented by linear combination of numer¬ 
ical atomic orbitals that are solutions of corresponding atomic core-hole states. We demonstrate 
that our scheme efficiently calculates all possible multiple-hole configurations of molecules formed 
during XFEL pulses. The present method is suitable to investigate x-ray multiphoton multiple 
ionization dynamics and accompanying nuclear dynamics, providing essential information on the 
chemical dynamics relevant for high-intensity x-ray imaging. 
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I. INTRODUCTION 


The advent of x-ray free-electron lasers (XFELs) [l, 2] opens up a new era in science and 
technology { 3 - 5]. Unprecedentedly ultraintense and ultrafast hard x-ray pulses generated 
from XFELs enable us to measure molecular structures on the atomic scale and to explore 
the structural dynamics on the femtosecond scale. One of the most prominent XFEL appli¬ 
cations is imaging of biological macromolecules. X-ray crystallography with XFELs, after 
demonstration of the proof-of-principle [§], has started to reveal previously unknown protein 


structure Q, promising a breakthrough in structural biology (see reviews in Refs. |8Ml lj]). 


n 


Recent advances in time-resolved serial femtosecond crystallography 12-16] enable us to take 


a step forward towards molecular movies. To investigate molecular structure and structural 
dynamics with XFELs, one needs to understand radiation damage dynamics—x-ray multi¬ 
photon ionization dynamics and accompanying nuclear dynamics. Furthermore, the phase 


problem [Tj| is the bottleneck to reconstruct molecular structures in serial femtosecond 
crystallography as much as in conventional x-ray crystallography. To overcome the phase 


problem for x-ray crystallograp 


intermediate x-ray intensity 


18 


ly with XFELs, one uses conventional phasing technique at 
, or one takes an advantage of the large degree of ionization 




at high x-ray intensity. The latter brings in high-intensity phasing (HIP) methods 
including high-intensity multiwavel eng th anomalous diffraction H 21] and high-intensity 
radiation damage induced phasing 22]. The HIP techniques require detailed description of 
deep-inner-shell decay dynamics of heavy atoms influenced by the molecular environment. 
Therefore, understanding of radiation damage dynamics is the key for successful molecular 
imaging. 

Modeling of biological macromolecules exposed to XFEL radiation was pioneered by 


the seminal work of Neutze et al. 


23]. Since then, there have been several computational 


tools to address molecular imaging problems, for example, classical molecular dynamics 


model 124 


mod 


tra 


el 


25], particle-in-ccll approach 


26 


27], transport model [28], 29|, Coulomb complex 


311 ]. Some of these methods have been recently applied to ion fragment spec- 


32] and electron spectra 33] from Cgo molecules interacting with intense x-ray pulses. 


So far, most computational methods have been based on the independent-atom model or 
the plasma model. The description of the molecular environment is omitted by assumption 
or incorporated in an ad hoc manner. When a molecule absorbs x-ray photons, inner-shell 
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multiple ionization induces fragmentation dynamics 34j. l35|. Chemical bonds are weakened 


and electrons and holes rearrange before the molecule breaks apart 


36 


38]. Detailed elec¬ 


tronic structure calculations for chemical bonding and charge rearrangement are thus crucial 
to describe molecular effects in modeling of the XFEL-matter interaction. 

The electronic response of atoms and molecules to XFEL pulses is in essence characterized 


by multiphoton multiple ionization dynamics 


39 41]. The xatom toolkit 42, 43] has been 


developed to simulate the XFEL-atom dynamical interaction and the ionization dynamics 


model has been tested with a series of experiments 


444481]. The unprecedentedly large 


number of x-ray photons delivered by an XFEL pulse induces sequential x-ray absorptions, 
creating a variety of different g-holc configurations for each charge state +q. To simulate 
ionization dynamics, one needs to calculate photoionization cross section, Auger rate, and 
fluorescence rate for each configuration and solve a set of coupled rate equations for the 


time-dependent populations of the configurations 39, |49|, [50]. The g-hole electronic config¬ 


urations are energetically highly excited, and theoretical treatment of such highly-excited 
states is challenging. Another challenge is the complexity of the ionization dynamics. Even 
for the atomic case, one must solve more than 20 million coupled rate equations for Xe L- 


shcll-initiated ionization dynamics 461 ]. To address this formida 


approach has been proposed for solving the rate equations 


43 


ole problem, a Monte-Carlo 


451 ] and sampling the most 


probable configurations 46]. In this scheme, the electronic structure is calculated for every 
single configuration selected on the fly as part of the Monte Carlo sampling. For the molec¬ 
ular case, the complexity increases even further because of the degrees of freedom associated 
with atomic motions, so the Monte Carlo approach seems to be the only way to simulate 
molecular response to an intense XFEL pulse. However, the most important question still 
remains: how to describe the electronic structure of molecules on the fly for the Monte Carlo 
method? 

Here we present a new efficient method for electronic structure calculation of polyatomic 
molecules and implement a dedicated toolkit, xmolecule. The proposed method is well 
suited for calculations of molecular multiple-hole configurations that are formed during x- 
ray multiphoton ionization dynamics. To efficiently describe molecular orbitals of core-hole 
configurations, the method employs atomic orbitals as basis functions that are numerical 
solutions of atomic core-hole states, calculated by XATOM [42]. For any given molecular 
electronic configuration and any given molecular geometry, XMOLECULE calculates molecu- 
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lar orbitals and orbital energies, which are essential components for dynamical simulations 
of x-ray multiphoton multiple ionization. We demonstrate that xmolecule is capable to 
calculate the whole spectrum of multiple-hole configurations at a given molecular geome¬ 
try and potential energy surfaces for given multiple-hole configurations of molecules. Also 
performance scalability with the system size is discussed. In this paper, we focus on the 
implementation of a molecular electronic-structure approach. Calculating cross sections and 
rates and solving coupled rate equations to simulate ionization dynamics will be described 
elsewhere. Having achieved these results, XMOLECULE aims to play a key role in molecular 
imaging at high x-ray intensity. 

The paper is organized as follows. Section [TT] formulates our scheme to calculate molecu¬ 
lar multiple-hole configurations. It includes theoretical and computational schemes for basis 
function generation with numerical atomic orbitals, multicenter integration on a molecular 
grid, and direct Coulomb integral evaluation. In Sec. IIIII we show benchmark calculations 
for XMOLECULE, and then numerical results for the potential energy curves of various elec¬ 
tronic configurations of carbon monoxide, and single- and double-core ionization potentials 
of several polyatomic molecules. We discuss the scalability of our scheme to a molecular size 
of hundreds of atoms. This is followed by the conclusion in Sec. IIVI 

II. COMPUTATIONAL METHODS 

A. The Hartree-Fock-Slater method 

We consider a molecular system composed of iV atom atoms with Wiec electrons. The Ath 
nuclear charge and coordinates are denoted by Za and R^, respectively. The molecular 
charge state +q is given by q = Za — IV e i ec . We use the Hartree-Fock-Slater (HFS) 
method in which molecular orbitals (MO), ^(r), and orbital energies, are obtained 
by solving the effective single-electron Schrodinger equation (atomic units are used imless 
specified otherwise), 


-^V 2 + Uext(r) + Vff(r) + V x (r) ^(r) = £^(r), 


( 1 ) 


where Wxt(r) is the external potential due to the nuclei, 



( 2 ) 
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and the Hartree potential Vtf(r) represents the classical Coulomb interaction among the 
electrons, 


Mr) 



(3) 


and the last term Vx(r) represents the exchange interaction, which is approximated by the 


Slater exchange potential 


m, 


Vx{r) 


3 

2 



(4) 


The electronic density p( r) is obtained by the sum of squared MO’s weighted by the occu¬ 
pation numbers { n t } as 

p( r ) = J2 ni i^( r )i 2 ’ ( 5 ) 

i 

where n* G {0,1,2}. In contrast to conventional ground-state electronic structure calcu¬ 
lations, in which the IV e ie C spin-orbitals with the lowest energies are filled, we consider all 
possible {rti} subject to Yli n i = -^eieo in order to take account of electronic excited states 
representing g-holc configurations. 

The total energy within the HFS method is given by the sum of the nucleus-nucleus 
repulsion energy and the electronic energy, 


fhotal 


E 

A<B 


ZaZ 


A^B 


R.4 — R« 


+E ; 


1 

2 



p(r)p(r') 
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B. Linear combination of numerical atomic orbitals 


For atomic systems, the orbital is represented with spherical harmonics as 


4>nim{ r) = — r Y lm (0,<p), 


(7) 


where n, l, and m are the principal quantum number, the orbital angular momentum 
quantum number, and the associated projection quantum number, respectively. The ra¬ 
dial wavefunction u n i(r) can be solved by a numerical grid-based method. The xatom 


toolkit 


42] has been developed to solve 


he atomic HFS equation. By employing the gener- 


53) and imposing a spherically symmetric potential, 


alized pseudospectral (GPS) method [52 
xatom accurately calculates u n i(r) for a given (n, /)-subshcll, and accordingly for a 

given /i = (n,l,m). This numerical atomic orbital has been used to successfully calculate 
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FIG. 1. Numerical atomic orbitals for different core-hole states of the nitrogen atom. 


multiple-hole co nfig uration formed during x-ray multiphoton ionization dynamics in the 
atomic case 


41 


42] 


For molecular systems, we employ the linear combination of atomic orbitals (LCAO) 
scheme to construct molecular orbitals, 




( 8 ) 


where is the /ith atomic orbital (AO) and C pt is the coefficient of the /ith AO for the ith 
MO. Using Eq. (jHJ) transforms the self consistent field (SCF) Eq. (0Q) into the corresponding 
Roothaan-Hall equation [54], 

HC = SCE. (9) 

where E is a diagonal matrix of MO energies and C is the MO coefficient matrix. The 
elements of the Hamiltonian matrix H and the overlap matrix S are given as 

1 , 


H, lu = I d 3 r0 /i (r) 

= / d 3 r 0 M (r)^(r), 


2 V + Kff(r) 


M r ), 


( 10 ) 

(ii) 


where the effective potential U e ff( r ) = Kxt( r ) + Vh( r) + VA(r). Equation (jUJ) is solved in 
a self-consistent manner. To accelerate convergence we employ the direct inversion in the 
iterative subspace (DIIS) method 55|, [56]. When we encounter convergence problems at 
large bond distances, where the energy gap between the highest occupied valence orbital 


(HOMO) 


shifts 


and the lowest unoccupied virtual orbital (LUMO) is very small, we apply level 


57] in the SCF iterations. 
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Here our choice of basis set for the LCAO scheme is the numerical atomic orbitals (NAO) 
obtained by xatom described above. In Fig. [fj we plot the squared radial function \u n i(r)\ 2 
for the Is, 2s, and 2 p orbitals of the ground state of the neutral nitrogen (N) atom, the single- 
core-hole (SCH) state of N + , and the double-core-hole (DCH) state of N 2+ , respectively. 
Comparison among different core-hole states shows significant deformation of valence orbitals 
in states with core holes. To cover these effects efficiently in the molecular calculation, we 
use NAOs that are numerical solutions of the corresponding atomic core-hole states. For 
instance, N 2+ with one core hole at each atomic site (a DCH state) is calculated with basis 
functions optimized for N + (ls^ 1 ) on both N atoms, whereas N^ + with a single-site DCH 
state is calculated with basis functions optimized for N 2+ (ls -2 ) on which the core hole is 
located and basis functions optimized for neutral N on the other side. In this way, we expect 
core-hole MOs are well described by core-hole-adapted NAOs. 

To achieve utmost efficiency towards complex ionization dynamics, we employ the mini¬ 
mal basis set. Each AO with (n, l, m ) in Eq. (J7J) corresponds to a single basis function. Fully 
or partially occupied (n, Z)-subshclls contribute to a set of basis functions and each l gives 
(2 1 + 1) basis functions (|m| < l). For example, the N atom has Is, 2s, and 2 p (partially) 
occupied subshells, which constitute 5 basis functions (0i s , 02 S , 02 Px , 4>2p y , and 02 P J in total. 
This basis set is denoted as [2slp]. According to the minimal-basis-set scheme, the chemical 
elements from B to Ne have the same number of basis functions (A5,asis=5). In Section fill A1 
we will discuss limitations and extensions of the minimal-basis-set scheme. 


C. Molecular grid and multicenter integration 


Equations (fTOl) and (fTTT) require evaluation of the corresponding integrals in three di¬ 
mensions. In our case, the 0 M (r) and 0*,(r) are represented with a radial grid and spherical 
harmonics. To perform 3D integrals involving many atomic centers, we employ the mul¬ 


ticenter integration proposed by Becke 


581 ]. Molecular grid points are constructed as a 


combination of sets of atomic grid points. Each set of atomic grid points, centered at 
one of the nuclei, consists of N r radial grid points and iV ang angular grid points. The ra¬ 
dial grid points are exactly the same as those used for NAO calculations with the GPS 
method |52L 1531 ]. The angular grid points are obtained by the Lebedev grid scheme 59] with 


an angular momentum cutoff at Z max . The number of angular grid points is approximately 
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given by 7V ang « 4(Z max +1) 2 /3. A detailed description of constructing multicenter molecular 


grid points is found in Refs. 60,[6l|]. We use an atomic radial grid size (r max ) large enough 
(~10 A) so that the atomic grids of many neighboring atoms overlap with each other. In 
principle, different atomic grid parameters can be used for individual atoms in a molecule. 
For convenience, however, we use the same grid parameters for all atoms. Then the total 
number of molecular grid points is given by Wrid = Wtom X N r x iV a ng- 


Becke’s multicenter integration scheme [58] introduces a set of smooth nuclear weight 
functions {^(r)}, subject to the constraint J2 A w A (r) = 1- The nuclear weight functions 


are generated by the third-order polynomial cutoff profile in the fuzzy cell scheme 


. Then 


any integral of a given function / can be evaluated by the sum of individual atomic integrals, 


1=1 d 3 rf(r) = Y^ f d 3 r f(r)w A (r) « ^ f d 3 r A f(r A )w A (r A ), 

J A J A J 


( 12 ) 


where = r — R^. Each atomic integral can be readily performed using the spherical 
coordinate system of r^, centered at the Ath atom, 


Ia= d 3 rj/(rj)to^(rj) ss ^/( r t )w A (rt)w k , 


(13) 


keA 


where k is the index of the grid points of the Ath atom anc 


of the radial Legendre-Gauss-Lobatto quadrature weights 62 
quadrature weights [59 ]. 


Wk is defined as a product 


63] and the angular Lebedev 


D. Implementation of direct Coulomb integrals 


In electronic structure calculations, one of the most time-consuming parts is the evalu¬ 
ation of electron repulsion integrals. In order to achieve fast calculation within a desired 
accuracy, we have developed a multipole expansion scheme with an adaptive cut off. First 
the integral involved in the Hartree potential in Eq. ([3]) can be decomposed into individual 
atomic integrals, 


V H (r) = / d 3 r' 


p(r') 


E 


d 3 r' A 


Pa{ r A 


I r A 


j i ’ 


(14) 


where Pa( r) = p(r)tCyi(r). Each single-center density p A (r) can then be regarded as the 
atomic contribution to the total electronic density. To implement the integral we expand 


















the single-center density with real spherical harmonics Y[ m (0,<p) as 

^max l 

Pa( r A ) = pL( r A)Ylm{6 A ,VA), (15) 

1=0 m=—l 

where p^ n (r) is the (l, m)-component of the spherical expansion, 

/»27T /»7T 

pf m ( r A) = dip A / dO A sin 0 A p A (r A )Y lm (6 A ,ip A ). (16) 

Jo Jo 

With this single-center decomposition and spherical harmonic expansion of the electronic 
density, /^(r), the Hartree potential in Eq. (j3J) is obtained as 

Mr) WV V£(r A )Y lm (8 A , VA ). (17) 

A l,m 

where is given by 

V imt r A) = xt— / dr^r'/^pf^r'A), (18) 

U + 1 Jo rJJ 

where r< = min^r^, r A ) and r> = ma x(r' A ,r A ). This radial integral is numerically evaluated 
in combination with various truncation methods (see the Appendix). 


E. Molecular electronic configuration 


Keeping the energetically lowest orbitals doubly occupied, the SCF procedure obtains the 
HFS solution for the electronic ground state. In order to obtain a solution for an excited 
electronic state of a g-hole configuration, each molecular orbital has to be assigned a specific 
occupation number. This can be done, as in the ground state calculation, by identifying the 
orbitals by their HFS energy eigenvalue. However, during the SCF iterations the energetic 
order of MOs may change. Thus, identifying the orbitals by ordering them according to their 
HFS energy eigenvalue may lead to failure of the above SCF procedure or yield a solution 
for a different electronic state than required. This is called variational collapse 64- 661. 


To prevent this situation, we employ a variant of the maximum overlap method 


66]. In 


the maximum overlap method, the desired excited electronic state is specified by a set of 
initial guess orbitals {t/^ 11688 } in combination with a set of occupation numbers {rij}. In each 
SCF iteration, the occupation number n* of the calculated orbital -0* is chosen according to 
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its projection onto the subspace spanned by the guess orbitals with respective occupation 
number. Specifically, we calculate the overlap of the ?'th current MO with the jth guess MO, 

Oy = <*!</?“} = E s„„c?r, («) 

where S /w = fd 3 r 0 /i (r)0® uess (r). Note that the basis set for the initial guess orbitals is 
not necessarily the same as the one used for the expansion of the actual molecular orbitals, 
because different NAOs can be used for different g-liole configurations. Therefore, S /JM can 
be different from the overlap matrix S )W defined in Eq. (HID . Then, the projections of the fill 
orbital into the span of the guess orbitals for the unoccupied (n=0), singly occupied (n=l), 
and doubly occupied (n=2) cases are given by 

Pt ] = V |O,/ , (20) 

j 

where j runs over all initial guess orbitals whose occupation number rij equals n. To pre¬ 
serve the character of the required electronic configuration during the SCF procedure, we 
choose the set of the occupation numbers of the current orbitals, {nj}, such that 
is maximized, while the total number of doubly and singly occupied orbitals is maintained. 

This procedure to determine the orbital occupation critically depends on the initial guess 
MOs. Thus, it is essential that the provided guess MOs together with the provided 
occupation numbers {%} describe a wavefunction that is close to the required solution. 
For the calculations performed here, we choose initial guess MOs obtained from a previous 
calculation for a lower ionized electronic state or for the same electronic state with an altered 
molecule geometry. For the single-core-hole state in N 2 , we obtain a localized core hole on 
a specific nucleus by performing a Boys-orbital-localization procedure (gtI of the two guess 
core orbitals. Having obtained a converged solution, we verify that the obtained set of MOs 
is indeed close to the initial guess, by inspecting the individual overlap O VJ . 


III. RESULTS AND DISCUSSION 

A. Benchmark calculations 

We first estimate the accuracy of our calculations using the numerical multicenter integra¬ 
tion in comparison with conventional calculations using the analytic Gaussian integration by 


10 


GAMESS 


68|. Here we employ the 6-31G Gaussian basis set [69] to calculate the SCF-level 


ground-state energy of a water molecule. The internuclear distance of i?(OH) = 0.957 A and 
the bond angle of Z(HOH) = 104.48° are used. Only in this test we employ the restricted 
Hartree-Fock (RHF) method instead of the HFS method, in order to directly compare with 
the GAMESS results. Figure [ 2 ] shows that our numerical calculations converge to the GAMESS 
results as the number of radial grid points per atom ( N r ) and the number of angular grid 
points per atom (determined by / max ) are increased. The total number of molecular grid 
points for N r =50 and / m ax=8 is 3 x 50 x 110 = 16500. The maximum radius r max =20 a.u. 


and the GPS mapping parameter 


52 


53) L =1 a.u. are used. Note that all grid parameters 


utilized here provide a numerical accuracy |AI7| <1.5 eV. If chemical accuracy is required 
(typically 1 kcal/mol ^ 0.04 eV), our study for the water molecule shows that it is achievable 
with N r > 200 and Z max >11, keeping the same L and r max . As to be shown in Sec. IIIIB1 
the energy scale of x-ray-induced dynamics of highly-charged molecules will extend into the 
keV regime. Therefore, the worst grid parameters (for example, N r =30 and Z max =4) shown 
in Fig. [2] would be sufficient to describe the molecular ionization dynamics at high x-ray 
intensity. 

We next examine the performance of our NAO basis set scheme. In Fig. |3l we show the 
calculated HFS energy of (a) the ground state of neutral N 2 molecule with NAOs optimized 
for neutral N atom and (b) the quadruple-core-hole (QCH) state of N^ -1 " ion with NAOs op¬ 
timized for the DCH state of N 2+ . The internuclear distance i?=1.096 A is fixed. N r = 200, 
L — 1 a.u., r max = 20 a.u., and Z max = 11 are used. The results are shown together with those 
obtained by the equivalent calculations using conventional Gaussian-type-orbital (GTO) ba¬ 
sis sets of different sizes (STO-3G 70] and a series of Dunning’s correlation-consistent basis 


sets [71]; All GTO basis sets are obtained from the EMSL Basis Set Library [72)). A E is the 
energy difference from the total energy calculated with the uncontracted version of cc-pV6Z, 
[16sl0p5d4f3g2hli] with 161 basis functions, which is considered here the complete basis set 
limit. Thus A E indicates the numerical error due to lack of basis functions. 

In both Figs. [3](a) and (b), one can see that the minimal NAO basis set is superior to 
the conventional minimal basis set of STO-3G, illustrating that fully optimized NAOs are 
a practical choice for the basis set in the LCAO scheme. Also Fig. [3] shows convergency of 
GTOs with respect to the number of basis functions. Interestingly, the conventional GTOs 
for QCH N^ 4 " perform almost one order of magnitude less accurate than GTOs for neutral 
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FIG. 2. Convergence of the total HF energy with respect to the number of grid points. Ay is 
the number of radial grid points and Z max controls the number of angular grid points per atom. 
The ground-state energy calculation of H 2 O with RHF/6-31G is performed using the numerical 
multicenter integration, and A E is the difference from the result obtained using the analytic 
Gaussian integrals. 




(a) neutral N 2 


(b) quadruple-core-hole Nj 


FIG. 3. Comparison of convergency in total energy with respect to the number of basis functions, 


using the GTO scheme and the NAO scheme: (a) neutral N 2 and (b) QCH Ng . A E is defined by 


the total energy difference from the complete basis set limit (see text). 
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N 2 . The reason is that GTOs are optimized to be used for neutral ground-state calculations. 
In contrast, NAOs optimized for corresponding atomic g-hole configuration provide similar 
accuracy for both neutral N 2 and QCH Ng • Thus NAO functions provide an ideal basis set 
for our minimal-basis-set HFS scheme. 

To improve accuracy, we try to increase the number of NAOs in a systematic manner 
by including unoccupied atomic orbitals with higher (n, /) such as 3s, 3p, and so on. As 
shown in Fig. [3l the NAOs are somewhat inefficient to achieve higher accuracy by simply 
extending to higher (n, /), as previously reported in Ref. [73]. This is attributed to the 
fact that additional series of higher (n, /)-orbitals, whose mean square radius is far from 
the atomic center, are inefficient for representing bonding molecular orbitals. Instead, we 
propose a scheme for adding compact p-type and d-type functions to the minimal NAO 
basis set in order to improve the description of chemical bonding. Additional functions are 
constructed by use of radial wavefunctions of occupied subshells multiplied by r, where r 
is the radial coordinate in the atomic system. For the chemical elements from B to Ne, 
the p-type functions are U 2 S (r)Yi m ( 6 , <p), where m = 0,±1, and the d-type functions are 
U 2 p{r)Y 2 m ( 0 , <p), where m = 0,±1,±2. By adding these functions, as denoted by extended 
NAO (NAO[e]) and as marked with the black rectangle in Fig. [3] the accuracy is much 
improved; the total energy of neutral N 2 is close to the cc-pVDZ level and the total energy 
of QCH N^ 4- is close to the cc-pVQZ level. The number of basis functions for NAO[e] is only 
13 per atom, whereas cc-pVQZ has 55 basis functions. There have been several approaches 
for extension of the minimal NAO basis set 3, 74], where additional basis functions are 
constructed in a schematic way. 


B. Potential energy curves for various hole configurations 

Figure [4] shows the HFS total energies in Eq. (jHJ) using core-hole-adapted NAO basis 
functions for all possible g-hole configurations that can be accessed by x-ray multiphoton 
ionization of the neutral carbon monoxide molecule. The internuclear distance i?=1.128 A is 
fixed, and the grid parameters of AA =50, L —1 a.u., r max =20 a.u., and l max =8 are used. For 
convenience, the figure shows these configurations grouped into charge states. The lowest 
horizontal line for each charge state indicates the ground-state energy for a given charge +q. 
This figure then illustrates how much energetically excited the g-hole configurations are. For 
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FIG. 4. Spectrum of total energies for various electronic states of CO, which are accessible by x- 
ray multiphoton ionization. The colors indicate different core hole configurations. Potential energy 
curves of the DCH states inside the box will be shown in Fig. [5] 


example, the energy of DCH C0 2+ (Ols 2 ) is about 1 keV higher than the ground-state 
energy of C0 2+ . Ionization dynamics induced by intense x-ray pulses may occur step by 
step, visiting lots of these electronic states. Therefore it is crucial to efficiently calculate this 
set of electronic states of g-hole configurations. 

We further investigate the behavior of the potential energy curves (PEC) obtained using 
the NAO basis set. In Fig. 0 we show the calculated HFS total energies for three different 
types of C0 2+ DCH states: (a) Cls" 2 , (b) Cls -1 01s -1 , and (c) Ols" 2 . The solid red line 
indicates PECs calculated with core-hole-adapted NAO[e], The dashed red line indicates 
PECs with core-hole-adapted NAO without additional functions. Both results are com¬ 
pared with the solid blue line calculated with the conventional cc-pVTZ basis set. Previous 
theoretical studies of core-hole states suggested that calculations of the cc-pVTZ level are 


reasonably converged 


75 


76]. Our NAO[e] scheme reproduces well the cc-pVTZ results, even 


though the size of NAO[e] (Ab as i S =13) is much smaller than that of cc-pVTZ (Ab as i S =30). 
For comparison, we also plot PECs with NAOs optimized for neutral ground-state atoms, 
denoted by NAOfn], which shows a trend similar to what a conventional ST0-3G minimal 
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FIG. 5. Potential energy curves of C0 2+ double-core-hole states as a function of the internuclear 
distance R. Energy is given relative to the ground-state energy of neutral CO. 

basis set would be. The NAO[n] results represent a poor estimate of PECs due to missing 
the core-hole effect on orbitals. On the other hand, PECs from NAOs, which are optimized 
for atomic core-hole states, show dramatic improvement over NAO[n], even though NAO 
and NAO[n] have the same number of basis functions (N hasis =5). 


C. Single- and double-core ionization potentials of molecules 


To further test the accuracy of our calculation scheme we compare core ionization po¬ 


tentials for a series of small mo 
geometries are taken from Ref. 


ecules obtained from the HFS calculation. The molecular 


77] and the grid parameters are the same as those used in 


Sec. 1III Bl We derive the single-core ionization potential from the HFS orbital energy of a 
neutral ground-state calculation using NAOs with and without additional basis functions. 
The double-core ionization potential is calculated as the sum of the first and the second core 
ionization potential, where the second core ionization potential is taken from the orbital 
energy of the SCH state calculation. For the DCH states with core holes on different nuclear 
sites, thus, two values are obtained for the two different ionization sequences. 

Table [I] lists the ionization potentials compared with the values obtained from complete- 
active-space SCF (CASSCF) calculations 75J and experimental values 78H83j|. For our 


calculations of two-site DCH states, a mean value and a deviation are listed for two values 
from the different ionization sequences. For the CASSCF results of two-site DCH states, 
only triplet spin states are listed and the difference between singlet and triplet states is 
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smaller than 0.7 eV. Note that N2O is a linear molecule N*—N c —O, where N* indicates the 
terminal N atom and N c means N at the center. As can be seen, the CASSCF results [75I ] 
show agreement within less than 4 eV with the available experimental values. The single 
ionization potentials we extract from the much simpler HFS calculation using the minimal 


NAO basis set show for all molecules a similar agreement within 5.1 eV, except FIs 


-1 


111 


LiF (28 eV). For the DCH states, where the core holes are located on different nuclear sites, 
with the minimal NAO basis set we also see a similar agreement within 7 eV to the CASSCF 
values and, where available, the experimental values. Again, LiF is an exception showing a 
much larger discrepancy of ~ 30 eV. For the DCH state with core holes on the same nucleus 
we find a systematically larger disagreement of about 20-30 eV (for Fls~ 2 in LiF 78 eV). 

The inclusion of the p-type and d-type functions in the basis set leads in most cases to a 
larger deviation to the literature values than the results obtained with the minimal basis set. 
For these calculations we get ionization potentials that tend to be lower than the literature 
values (from 3.4 eV for Lils^ 1 to 60.5 eV for Fls~ 2 in LiF). Clearly, the extended NAO 
basis set should improve the quality of the electronic structure model, as the electronic wave 
function has more flexibility. Thus, we conclude that the good agreement with the minimal 
basis set might be an artifact due to cancellation of errors. 

For the results obtained with the larger basis set, we attribute the remaining deviations 
to the literature values mainly to relaxation energy contributions associated with the core 
hole electron removal. The applied scheme of taking orbital energies as ionization potentials 
cannot account for these effects. For core holes on the same nuclear site, where the core 
hole relaxation contributions are particularly strong, we see the strongest deviations (18.2- 
60.5 eV). Also, the extreme deviations for LiF may be explained from these contributions: 
The core hole on the F atom in LiF shows a particular large core hole relaxation effect, 
whereas for the core hole on Li it is very small 75]. 


D. Performance scaling 

Our implementation of XMOLECULE aims for large-scale molecular calculations, espe¬ 
cially for a large number of repeated calculations where time and resources available for 
each calculation are severely limited. At the same time, it requires the capability of calcu¬ 
lating a moderate-size systems in order to describe molecular-environment effects. Here we 
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demonstrate the performance scalability of our scheme toward molecular calculations with 
a few hundred atoms. O ur g rid-based method has the potential to achieve linear scaling in 


the number of atoms 


85M88], 


In the HFS method, the two-body interaction is divided into the exchange interaction and 
the direct Coulomb interaction. The former is replaced with the local density approximation, 
and the latter is treated with the Hartree potential as described in Sec. Ill Dl The computa¬ 
tional complexity of the Hartree potential is 0(N^ lid ), where iVg ri d is linearly proportional to 
^atom, because the potential Vh{x) in Eq. ([3]) contains the integral over molecular grid points 
and has to be evaluated at every single molecular grid point. By introducing the truncation 
methods described in the Appendix, this complexity can be reduced to O (Ng^N^™). These 
truncation schemes do not change the quadratic scaling behavior with respect to IV atom , but 
reduce the actual computational time by several times (for example, a factor of two in our 
following calculations). 

Another truncation can be made in the evaluation of one-body matrix elements in 
Eqs. (TTOT) and ((Till . Both H ^ and S^ v are decomposed into atomic contributions by the mul¬ 
ticenter integration: H^ m H'X and S IW « J2 A S^ U . We define an AO pair 0 M (r)0„(r) 
and its contribution to each atomic grid, 


Qiu= / d?r A \<l>„(r A )<l> l ,(r A )\wA(r A ). 


( 21 ) 


Then we set H^ u and S^ v to zero if Qf LV < e, where e is a control parameter. The complexity 
of the integrals in Eqs. m and CD is 0(AgasisATgrid), where both N hasis and IV grid are 
linearly proportional to IV atom ■ By using our truncation scheme described above, we can 
reduce it to a quadratic behavior with respect to IVatom- 

Figure [6] shows the size dependence of the computation time of XMOLECULE with the cur¬ 
rent truncation schemes. We calculate the HFS ground state of C 2 4 H 12 molecule (coronene) 
in its equilibrium molecular geometry taken from Ref. 7 t|. And we perform calculations for 
n such molecules (n— 1,..., 7) stacked in the vertical direction with an interlayer separation 
of 3.3 A. The minimal NAO basis set is used with N r —20 , L=1 a.u., r max =10 a.u., and 
Z m ax=4. The y axis is the CPU time per SCF iteration in seconds on a lab workstation 
(Intel Xeon X5660 2.80 GHz), and the x axis indicates the number of atoms in the stacked 
(C 24 H 12 ) n molecule. When all truncations are off (blue curve), the computational perfor¬ 
mance shows close to a cubic dependence. On the other hand, when the truncation method 
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Number of atoms 

FIG. 6. Performance scaling with respect to the molecular size. The y axis is the CPU time per 
SCF iteration in seconds, and the x axis is the number of atoms in stacked (C 24 Hi 2 ) n molecules. 
The dotted lines with 0(N% tom ) and 0(JVf tom ) indicate a quadratic behavior and a cubic behavior, 
respectively, with respect to the number of atoms, IV a tom- 

of Eq. (|2TT) is applied with £ = 1CT 3 (red curve), the scaling shows a quadratic dependence 
on the system size. Note that when the truncation of Eq. (I2TT) is used, the complexity of the 
matrix element calculations is reduced to a quadratic relation, while the Hartree potential 
calculation becomes the most time-consuming step, which is also governed by a quadratic 
scaling. The difference in the total energy between the calculations with and without this 
truncation is less than 0.14 eV/atom, whereas the truncated calculation is about 7.5 times 
faster than the calculation with no truncation. The calculation with 216 atoms (n= 6) takes 
40 seconds per single SCF iteration on the lab workstation. The whole computation time 
takes about 14 minutes including the overhead costs for numerical grid construction and 12 
SCF iterations. When additional truncation schemes for the Hartree potential (see the Ap¬ 
pendix) are applied with £ 0 =0.1 and £i=0.01 (green curve), the complexity is a bit reduced 
towards a linear relation and the errors in the total energy are less than 0.93 eV/atom. The 
actual computational time per iteration is improved by a factor of two for the 216-atom 
case. 
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IV. CONCLUSION 


In summary, we present a new method to calculate various multiple-hole electronic states 
for polyatomic molecules that may be formed by x-ray multiphoton ionization dynamics at 
high x-ray intensity. The method is based on the Hartree-Fock-Slater method, employing 
the linear combination of atomic orbitals (LCAO) scheme, where numerical atomic orbitals 
(NAO) are used as a minimal basis set for molecular orbital calculations. Usage of NAOs 
has two advantages over conventional Gaussian-type basis functions. First, NAOs are ob¬ 
tained from numerical solutions for atomic core-hole states at the same computational level. 
Second, accuracy and efficiency of numerical integration with NAOs are controllable by grid 
parameters and truncation schemes. The NAOs presented here are accurately solved by 
using the numerical grid-based method that is implemented in the XATOM toolkit. 

Using core-hole-adapted NAOs, molecular orbitals for core-hole states are efficiently cal¬ 
culated. We present benchmark calculations for multiple-core-hole states of N 2 . The NAO 
results show consistent accuracy for different charge states, which is not the case for conven¬ 
tional basis sets that are optimized for neutral systems. We demonstrate that our scheme 
is able to calculate all possible configurations that may be formed by removing zero, one or 
more electrons from the ground-state configuration of neutral CO molecule. The electronic 
state during x-ray multiphoton ionization dynamics may visit several of these multiple-hole 
configurations, which are energetically excited by about 4 keV with respect to the ground- 
state configuration of neutral CO. For molecular and ionization dynamics during XFEL 
pulses, we need not only all different multiple-hole states but also potential energy surfaces 
for individual electronic states. For double-core-hole states of C0 2+ , we calculate poten¬ 
tial energy curves with core-hole-adapted NAOs, in good agreement with converged results 
with respect to the basis-set size. Also we present single- and double-core-hole ionization 
potentials for several molecules in comparison with available theoretical and experimental 
data. 

Efficient electronic structure calculations for molecules are essential for dynamical mod¬ 
eling of molecules at high x-ray intensity. We have implemented xmolecule to make a 
step toward dynamical simulation of molecular imaging with XFELs. Calculations of pho¬ 
toionization cross sections, fluorescence rates, and Auger rates for all possible configurations 
formed during molecular ionization dynamics are in progress. 
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APPENDIX 


Here, we introduce truncation schemes on The upper bound of V t A I is given by 
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Then we define 


(] A — 
a lm ~ 


dr' A r' A \pf m (r' A )\ , 


(A.22) 


(A.23) 


to be used as a truncation indicator. Note that the number of electrons in the Ath atomic 
electronic density is given by Q A = fd 3 rp A ( r) = \/4Trd A 0 . Within the atom we consider 
higher multipole moments of the density to be less relevant. Thus, if d ^ is small enough in 


comparison with d A Ql then the contribution of l and m is truncated, i.e., 


'A 


v ,l(r A ) 0 when 

a oo 


(A.24) 


where £i is a truncation control parameter. 

Another truncation is that if the distance from the origin of the Atli atom is large enough, 
the Hartree potential contributed from A is approximately evaluated by the monopole only 
and all / > 0 contributions are truncated, i.e., 


v im( r A) 0 when r A > r c , 


(A.25) 
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where r c is a cut-off radius given by r c = Qa/z o = \/47r^oo/ £ o- Here £ 0 is another truncation 
control parameter. 
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TABLE I. Single core hole and double core hole ionization potentials in eV. The molecular geome¬ 
tries are taken from Ref. [tt!]. 

Molecule 

Configuration 

Present (NAO) 

Present (NAO[e]) CASSCF [75] 

Exp. 

CO 

Ols" 1 

537.43 

533.80 

542.82 

542.5“ 


Cls -1 

295.81 

289.84 

296.36 

296.5 b 


01s" 2 

1139.12 

1136.43 

1176.56 



Cls" 2 

647.50 

636.89 

664.42 

667.9 b 


Cls^Ols" 1 

850.70T2.28 

840.34i0.52 

855.20 

855.3 b 

LiF 

FIs -1 

663.77 

670.23 

688.04 

691.8 C 


Lils -1 

59.34 

58.56 

65.33 

61. 9 d 


Fls“ 2 

1403.81 

1420.99 

1481.50 



Lils" 2 

154.84 

153.19 

172.60 



Lils- 1 Fls“ 1 

735.13i2.72 

739.48il.06 

763.28 


n 2 

N V 

409.57 

403.30 

411.03 

409.9 e 


Nlu- 1 

409.54 

403.26 

410.93 



Nls -2 

878.29 

868.83 

901.16 

903.2* 


Nls^Nls^ 1 

836.96T0.02 

823.87i0.02 

836.44 


n 2 o 

01s" 1 

537.81 

534.72 

542.54 

541.4 9 


Nils' 1 

408.70 

403.66 

408.61 

409.0* 


N c ls _1 

413.74 

407.89 

412.52 

412.5 e 


01s- 2 

1138.96 

1136.23 

1173.25 



N t ls" 2 

874.42 

866.83 

893.93 



N c ls" 2 

883.76 

875.54 

902.31 



Ols-^tls -1 

961.29i0.25 

951.47i0.25 

963.27 



01s -1 N c ls -1 

964.30i0.40 

954.53i0.37 

965.62 



Ntls^NJs- 1 

836.55i0.01 

825.25i0.ll 

833.22 

834.2* 



































